Shazam-Bot

Adith Ramachandran (ar729), Elias Hanna (eph48), Annie Chang (ac2329)

Introduction

In this project, we designed and built a Raspberry Pi based robot that can sense the audio source in a certain distance, approach to the audio source, and analyze the audio’s frequency distribution. The project is formed by two major parts: navigation/mobilization and signal processing. The microphone installed on the robot provides the volume and the frequency of the sound; the volume is used to determine the direction to move towards while the frequency goes through FFT to be displayed on the piTFT screen.

Objective

Design a Raspberry Pi-based robot that can:

Find the audio source with an analog microphone and Approach it by servo-motor based car
Analyze the audio with FFT algorithm
Display the FFT graph on the piTFT screen

Design

Hardware

Audio Sampling Hardware

To successfully analyze an audio signal, it is important that the audio input we used was a clear one. We used an analog microphone module provided by the lab to sample audio. This breakout board contained an analog microphone connected to a differential gain amplifier with variable gain. The gain was controlled by a potentiometer. The module used 5V and GND inputs, and gave an analog amplified microphone output. We decided to increase the gain on the microphone as high as possible so that we could detect noise from as far away as possible. The breakout board output then went through a low pass filter with a cutoff frequency of about 2000 Hz. We were planning on using the audio for music identification and this is at the highest range of frequencies in most music. We then had our amplified and filtered microphone signal that we could now start to send to the Pi.

Unfortunately, the Raspberry Pi does not have any analog inputs, this meant that we could not send the microphone signal directly to the Pi. We used an Arduino Uno in place of a standard Analog-to-Digital Converter because we could simplify our circuit this way. The arduino was powered off of a 9V battery and took the microphone output as its only input. We then used the arduino analogRead() function to get a digital value representing the analog signal. The Arduino Uno has a 10-bit ADC which uses the Arduino Vcc or 5V as its reference voltage. This would give us digital values from 0 to 1023 representing 0 to 5V from the microphone. We then sent the values over the Arduino Tx serial line to the Pi at a baud rate of 19200.

We used a level shifter in between the Pi and the Arduino because our Pi GPIO pins needed a 3V3 signal and the Arduino serial output was at 5V. The level shifter dropped the voltage from 5V to 3.3V and then we used the designated Pi Rx pin to receive serial data. We enabled the serial hardware in the raspi-config menu and we could now receive the serial data through the /dev/ttyS0 port on the Raspberry Pi like any other serial data.
Motor Control Hardware

Our motor control hardware for the continuous servos was the same as for the robot we built in Lab 3. The motor controller took in a PWM signal for the speed of the two motors and two directional signals for the direction of each individual motor. It then outputs to the left and right motors as specified in the datasheet. It also took a 6V power input to drive the motor with, the GPIO signals from the Pi were isolated from the motors as to not cause any dangerous back EMF.

We added new hardware to control the standard servo we mounted on the top of our robot. This servo would serve to rotate the microphone around on a stick to the three cardinal directions. We powered the servo from the same 6V power source as the continuous servos, and set up a new GPIO pin on the Pi to provide the PWM control input.

Software

Serial

Our software serial module handles reading the serial from the arduino and handling it in a way that we can use. We used the pyserial library to set up the serial ports in python. We set up /dev/ttyS0 at 19200 baud in the program initialization. The serial data first comes as byte formatted data with a newline in between each character, so a serial read consists of reading a single line of the serial stream and converting it into a python integer. By doing this we keep the same range of values 0-1023 from the Arduino. We found that the pyserial library actually keeps serial data from ttyS0 in a buffer and when we call serial.readline() it reads from the pyserial buffer. Because it was important to us that we had the most current, live data, we clear the pyserial buffer before we read anything from serial. We decided that a constant N=256 values was a good amount of values to read each time so we saved 256 values into a global data array every time we performed a serial read.
Localization

Our software implements a localization module as one of the steps in our main program. The purpose of this step is to locate the source of the sound using the standard servo with the microphone on it. First we rotate the servo to the right, causing the microphone to face right. We then take a larger set of data samples ~1000 samples. We then determine the “loudness” of the sound in that direction and record it. We then rotate the mic to face forward and then left, performing the same measurement each time. Now, we can compare the three “loudness” values and pick the maximum. If the maximum volume was forward the robot will move forwards, navigating towards the sound. If the maximum was left or right, the robot will turn 90 degrees in that direction. After each movement action is taken, the robot repeats the steps again. If the robot turns left or right, then the next time it measures it should measure straight and continue forwards towards the sound. When the volume exceeds an experimentally set threshold volume, the robot will determine it is close enough to the sound and stop the localization part of the program.

To determine the “loudness” of the sound in a direction we used the following algorithm. First, take every value and subtract 528 from the value, because the signal is a wave, this is the “zero” volume value. We then take the absolute value so we have a set of distances away from 528. Then we sum the values and take an average, getting an average distance away from 528. This algorithm is subject to some inaccuracy due to sampling frequencies and the nature of waves. Because of how sampling rates work it is possible that one set of frequencies will be sampled, coincidentally around the peaks but another signal with higher peaks will be sampled at lower parts of the wave. For our purposes, with a large number of samples we found that we could consistently identify the direction of the sound so we were not worried about the theoretical inaccuracies.
Signal Analysis

The signal analysis module was contained in the serial_reader module, despite it being generally separate. This module handled the analysis of the analog microphone input, and the displaying of the input and the FFT to the piTFT. We used this module a lot at first just to test the serial reader and the audio hardware. We used the numpy and matplotlib libraries for number processing and graphing respectively. Initially, we used matplot lib to simply show a graph of the microphone input over time. This was to check that we were in fact getting a wave from the microphone and the serial data was good. To get a reasonable audio sample, you have to sample at at least twice the frequency of the signal you are reading. We were mainly focused on frequencies in music so we sampled at a frequency of 3000Hz on the Arduino. As stated above, we took 256 samples from serial each time we wanted to read a section of data. This was a conscious decision as you can perform an FFT most efficiently on sets of data with the size of a power of two.

The signal analysis module is relevant to our main program after the sound source has been located. We then enter the freerunning FFT mode. In this mode, the robot displays an FFT of the sound on the piTFT in real time and updates it as fast as it can. We used numpy first to generate an FFT with the 256 inputs and a given sampling rate of 3000 Hz, we then calculated a few more constants to create a nice graph of the FFT using matplotlib. The matplotlib library does a good job of refreshing graphs, however the matplotlib library would not let us use the piTFT as the display target for the graph. Our solution to this was to save the FFT to an image, output.png each time it was updated. We then used pygame to display this image on the piTFT immediately after the FFT graph was generated. This allowed us to display the FFT on the TFT as fast as the FFT could be performed. We chose 256 values in the data array because this gave us an accurate FFT while still generating fast enough to update live, even if at a slow framerate.
Driver

The Driver module contained a class that interfaced with every other module in the codebase. It also contained the state machine logic for the robot. The driver contained a Locator object and a SerialReader object, and also had a member variable to keep track of its state. We also created an Enum class called DriverState to keep track of the state. As stated before, we had a locating state and a displaying state. In the main function, we set up the GPIO pins, instantiate a SerialReader and a Locator and then use those to instantiate the Driver object. After that, we call Driver.step in a “while True” loop (or until the user manually exits the program), and the state machine transitions are handled internally as specified above. In the localization phase, once the volume of the audio sample reaches a certain threshold, we transition to the display phase, where we continually display and update the FFT of the audio samples taken by the microphone.

Testing

In the initial stages of the implementation, we used an oscilloscope and serial monitor to test the hardware. The low pass filter, arduino, and microphone were all able to be tested this way. We found a problem with our hardware causing strange serial data and found that the capacitor in the low pass filter was the incorrect value. Once we tested in the manner and found that the hardware was working, we could move on to the software with confidence that we would not be held back by hardware bugs.

The software was mostly tested by trial and observation. To test the localization code and algorithms we ran the code in a script that repeated the localization in a loop. We moved a sound source from place to place and checked that we had high consistency in our results before moving to test another module. We tested the FFT and serial modules by displaying the graphs on the monitor and examining them to ensure we were getting the expected value. We had some issues with the correct sampling frequency that we were able to resolve in this manner. Once we were confident in our FFT and sampling, we moved the FFT to the piTFT and had confidence in that part of the system.

Finally, we ran multiple trials of the driver code and ensured that the movement and state of the robot was all as expected. This gave us confidence in our system from the bottom up.

Results

Our team ran into many more unexpected challenges than we had anticipated when starting the assignment. Some of them were discussed shortly in the Design and Testing section but we learned some good lessons in approaching a problem. The major issues we had were with our materials not performing exactly how we had expected.

Firstly, we had problems with movement and servos. While the robot made the correct decisions for when and where to move, the continuous servos we were using to drive the robot were not strong enough to move the robot when it was on the ground. Had we tested much earlier we could have compensated for this with bigger servos and more power, but as it stood we were forced to demonstrate the movement as a proof of concept and show the wheels performing the correct actions when the robot was suspended off of the ground. We also had some issues with the standard servo that was driven entirely with PWM. We experienced that despite setting the angle with a specified software PWM signal like we had the continuous servos, the motor would jitter and not move smoothly or hold its position. We determined that this was because we used a software PWM signal from python which was subject to timing issues in linux due to context switching and process scheduling. In the future, we can use a hardware PWM signal and smooth out this servos performance.

We also had some unexpected problems with the audio sampling. Firstly, the microphone module we used despite having an amplifier, did not have a very far range. This made the concept of localizing a sound from far away much more difficult. The solution for a problem like this would be to get a better microphone, or an array of microphones built to detect further range sound. We decided to continue with the project anyways demonstrating the possibilities if we had a longer range microphone. Another issue we ran into was the sampling rate with the Arduino. We used the Arduino analogRead() function which is part of an arduino library to read from the ADC. Unfortunately, this does not read at a specific frequency and we were sort of left to guess the exact sampling frequency. In an ideal world or in the future of this project we could use the Arduino timer hardware to read the ADC on a timer interrupt. This would give us a very consistent and known sampling rate.

While we did run into these problems, we were still able to produce a final robot that demonstrated some pretty interesting possibilities. With slightly better hardware it is clear that a sound localization robot is a very possible project and could be extended to even more applications. We were happy with the results we saw and ended up getting consistent empirical results from everything we tested.

Demo

Conclusion

We were able to develop a good proof of concept for the idea we had in mind. Our robot was able to locate the source of an audio source and display the FFT of the audio source.
One minor issue we encountered was the power of the motors we were using. We found that the motion of the motors were heavily impeded as we added components to the robot such as breadboards and batteries.
The major issue we encountered throughout the project was being severely limited by the quality of the microphone we used. Since sound has a cubic falloff with respect to distance, the microphone was not able to detect sound accurately from more than a few inches away. In the localization step, if the audio source was far away from the microphone, the volumes in all 3 samples were too similar for the robot to make the correct choice. Additionally, when displaying the FFT, the audio signal that we were receiving from the microphone was quite noisy.

Future Work

Optimize the navigation mechanism so the locating function can be more effective
Optimize the sound collecting hardware so it can sense sound from a further distance
Analyze actual songs and search songs info in the online database
Display the frequency number on the screen or correspond the frequency to the music notes
Let the robot move back to the starting location then display the analysis

Code Appendix

driver.py

from enum import Enum
from movement.movement import Movement, Speed
from serial_reader.serial_reader import SerialReader
from movement.locator import Locator, LocatorTheta
from utils.gpio_setup import InputPinData, OutputPinData, PinInitException, setupInputs, setupOutputs, setupPwm
import RPi.GPIO as GPIO

LEFT_PINS = (5, 6)
RIGHT_PINS = (20, 21)
LEFT_PWM, RIGHT_PWM, SERVO_PWM = 26, 16, 23
PWM_FREQUENCY = 50

def setupPins():
    inputPins = []
    outputPins = [
        OutputPinData(LEFT_PINS[0], GPIO.LOW),
        OutputPinData(LEFT_PINS[1], GPIO.LOW),
        OutputPinData(RIGHT_PINS[0], GPIO.LOW),
        OutputPinData(RIGHT_PINS[1], GPIO.LOW),
        OutputPinData(LEFT_PWM),
        OutputPinData(RIGHT_PWM),
        OutputPinData(SERVO_PWM)
    ]
    try:
        setupOutputs(outputPins, True)
        setupInputs(inputPins, False)
    except Exception as e:
        print("error when setting up pins")
        exit(-1)

class DriverState(Enum):
    Searching = 0
    Recognizing = 1

class Driver():
    __curr_state: DriverState
    __locator: Locator
    __serial_reader: SerialReader

    def __init__(self, locator : Locator, serial_reader : SerialReader) -> None:
        self.__curr_state = DriverState.Searching
        self.__locator = locator
        self.__serial_reader = serial_reader

    def __locate_step(self):
        while not self.__locator.locate():
            pass
        self.__curr_state = DriverState.Recognizing

    def __display_step(self):
        while True:
            self.__serial_reader.fft_update()

    def step(self):
        state_to_handler = {
            DriverState.Searching: self.__locate_step,
            DriverState.Recognizing: self.__display_step
        }
        state_to_handler[self.__curr_state]()

try:
    setupPins()
    movementController = Movement(
        PWM_FREQUENCY,
        (setupPwm(LEFT_PWM, PWM_FREQUENCY), setupPwm(RIGHT_PWM, PWM_FREQUENCY)),
        LEFT_PINS,
        RIGHT_PINS,
        Speed.FULL
    )
    locatorClient = Locator(
        setupPwm(SERVO_PWM, PWM_FREQUENCY),
        PWM_FREQUENCY,
        SerialReader(),
        movementController)

    

except Exception as e:
    GPIO.cleanup()
    raise e
finally:
    GPIO.cleanup()

locator.py

from enum import Enum
import RPi._GPIO as GPIO
from time import sleep
from movement.movement import Movement
from serial_reader.serial_reader import SerialReader

class LocatorTheta(Enum):
    CW = 3.5
    HALT = 7.5
    CCW = 11.5

class Locator():
    __pwm_pin: GPIO.PWM
    __serial_reader: SerialReader
    __movement_controller: Movement
    __frequency: int

    def __init__(self, pwm : GPIO.PWM,
                 freq : int,
                 serial_reader : SerialReader,
                 movement_controller : Movement):
        self.__frequency = freq
        self.__pwm_pin = pwm
        self.__pwm_pin.start(LocatorTheta.HALT.value)
        self.__movement_controller = movement_controller
        self.__serial_reader = serial_reader

    def rotate(self, theta : LocatorTheta) -> None:
        self.__duty_cycle = theta
        self.__pwm_pin.ChangeDutyCycle(theta.value)
    
    def locate(self) -> bool:
        SERVO_TURN_DURATION = 2.5
        CAR_TURN_DURATION = 0.5
        CAR_FORWARD_DURATION = 1.0
        DECIBEL_THRESHOLD = 1000000.

        # TODO: If the decibel value is above a certain threshold then we can just stop early
        self.rotate(LocatorTheta.CCW)
        sleep(SERVO_TURN_DURATION)
        self.__serial_reader.fft_update()
        left_db_value = self.__serial_reader.calculate_avg_amplitude()
        self.rotate(LocatorTheta.HALT)
        sleep(SERVO_TURN_DURATION)
        self.__serial_reader.fft_update()
        middle_db_value = self.__serial_reader.calculate_avg_amplitude()
        self.rotate(LocatorTheta.CW)
        sleep(SERVO_TURN_DURATION)
        self.__serial_reader.fft_update()
        right_db_value = self.__serial_reader.calculate_avg_amplitude()
        print(str(left_db_value) + ", " + str(middle_db_value) + ", " + str(right_db_value))
        max_db_value = max(left_db_value, middle_db_value, right_db_value)

        if max_db_value > DECIBEL_THRESHOLD:
            return True
        else:
            if left_db_value == max_db_value:
                self.__movement_controller.turn_left(CAR_TURN_DURATION)
                print("Turning left")
                return False
            elif middle_db_value == max_db_value:
                self.__movement_controller.forward(CAR_FORWARD_DURATION)
                print("Moving straight")
                return False
            else:
                self.__movement_controller.turn_right(CAR_TURN_DURATION)
                print("Turning right")
                return False

movement.py

from dataclasses import dataclass
from enum import Enum
from time import sleep
from typing import Any
import RPi._GPIO as GPIO

'''
The movement class assumes that the relevant pins have been set up using
the functions in ../gpio_setup.py. The relevant PWMs also have to be set up
'''

class _WheelDirection(Enum):
    STOP = (False, False)
    CW = (True, False)
    CCW = (False, True)

class Speed(Enum):
    STOP = 0.0
    HALF = 50.0
    FULL = 100.0

@dataclass
class _MovementPins():
    pin1: int
    pin2: int

class Movement():
    __frequency: int
    __duty_cycle: Speed
    __pwm_pins: tuple[GPIO.PWM, GPIO.PWM]
    __left_pins: _MovementPins
    __right_pins: _MovementPins

    def __init__(self, freq : int,
                 pwms : tuple[GPIO.PWM, GPIO.PWM],
                 left_pins : tuple[int, int],
                 right_pins: tuple[int, int],
                 duty_cycle : Speed = Speed.STOP):
        self.__frequency = freq
        self.__duty_cycle = duty_cycle
        self.__pwm_pins = pwms
        self.__left_pins = _MovementPins(*left_pins)
        self.__right_pins = _MovementPins(*right_pins)
        self.__pwm_pins[0].start(duty_cycle.value)
        self.__pwm_pins[1].start(duty_cycle.value)

    def __set_speed(self, speed : Speed) -> None:
        self.__duty_cycle = speed
        self.__pwm_pins[0].ChangeDutyCycle(speed.value)
        self.__pwm_pins[1].ChangeDutyCycle(speed.value)

    def __set_wheel(self, pins : _MovementPins, dir : _WheelDirection) -> None:
        val1, val2 = dir.value
        if val1:
            GPIO.output(pins.pin1, GPIO.HIGH)
        else:
            GPIO.output(pins.pin1, GPIO.LOW)
        if val2:
            GPIO.output(pins.pin2, GPIO.HIGH)
        else:
            GPIO.output(pins.pin2, GPIO.LOW)

    def __set_wheels(self, left : _WheelDirection, right : _WheelDirection) -> None:
        self.__set_wheel(self.__left_pins, left)
        self.__set_wheel(self.__right_pins, right)

    def stop(self) -> None:
        self.__set_wheels(_WheelDirection.STOP, _WheelDirection.STOP)

    def turn_left(self, duration : float) -> None:
        self.__set_wheels(_WheelDirection.CCW, _WheelDirection.CCW)
        sleep(duration)
        self.stop()

    def turn_right(self, duration : float) -> None:
        self.__set_wheels(_WheelDirection.CW, _WheelDirection.CW)
        sleep(duration)
        self.stop()

    def forward(self, duration : float) -> None:
        self.__set_wheels(_WheelDirection.CW, _WheelDirection.CCW)
        sleep(duration)
        self.stop()

    def backwards(self, duration : float) -> None:
        self.__set_wheels(_WheelDirection.CCW, _WheelDirection.CW)
        sleep(duration)
        self.stop()

serial_reader.py

from typing import Any
	import serial
	import matplotlib.pyplot as plot
	import numpy as np
	import pygame, os
	
	SERIAL_PORT = '/dev/ttyS0'
	ser = serial.Serial(SERIAL_PORT, 19200)
	
	def read_char_as_int():
		ret = ser.readline()
		try:
			return int(ret)
		except ValueError:
			return 528
	
	
	class SerialReader():
		screen : Any
		data : list
		N : int
		
		def __init__(self) -> None:
					
			os.putenv('SDL_VIDEODRIVER', 'fbcon')
			os.putenv('SDL_FBDEV', '/dev/fb1')
			self.data = []
			self.N = 256
			pygame.init()
			size = width, height = 320, 240
			self.screen = pygame.display.set_mode(size)
			plot.ion()
			plot.show()
			plot.figure(figsize=(4,3), dpi=80)
	
		def read_data(self):
			self.data = []
			nTicks = 0
			while nTicks < self.N:
				nTicks += 1
				val = read_char_as_int()
				self.data.append(val)
			return self.data
	
		def calculate_avg_amplitude(self) -> float:
			amp_data = []
			for i in range(3):
				amp_data = amp_data + self.read_data()
			sum = 0
			for x in amp_data:
				x = abs(x - 528)
				sum += x
			avg = sum / len(amp_data)
			return avg
		
		def graph_fft(self, data):
			X = np.fft.fft(self.data)
			n = np.arange(self.N)
			T = self.N / ( 1 / .00033 )
			freq = n / T
			plot.cla()
			plot.xlabel('Freq (Hz)')
			plot.ylabel('Amplitude')
			plot.xlim(0, 2000)
			plot.ylim(-10, 5000)
			plot.stem(freq, np.abs(X), linefmt ='b', markerfmt=" ")
			plot.savefig("/fft.png")
			fftimg = pygame.image.load("/fft.png")
			fftrect = fftimg.get_rect()
			self.screen.fill((0,0,0))
			self.screen.blit(fftimg, fftrect)
			pygame.display.flip()
			
		def fft_update(self):
			data = self.read_data()
			self.graph_fft(data)
			plot.pause(.01)

gpio_setup.py

from dataclasses import dataclass
from typing import Any, Callable
import RPi._GPIO as GPIO

done_setmode = False

@dataclass
class InputPinData:
    pin_num: int
    pull_dir: int
    bouncetime: int = None
    callback: Callable[..., Any] = None

@dataclass
class OutputPinData:
    pin_num: int
    output_param: int = None

class PinInitException(Exception):
    pin_num: int
    message: str

    def __init__(self, num, message):
        self.pin_num = num
        self.message = message
        super.__init__(self.message)

def _setMode(set_mode : bool):
    global done_setmode
    if set_mode:
        GPIO.setmode(GPIO.BCM)
        done_setmode = True
    if not done_setmode:
        raise PinInitException(-1, "Setmode not done before pin setup")

def setupInputs(pins : list[InputPinData], set_mode : bool):
    _setMode(set_mode)
    for pin in pins:
        GPIO.setup(pin.pin_num, GPIO.IN, pull_up_down=pin.pull_dir)
        if pin.callback is not None:
            if pin.bouncetime is None:
                raise PinInitException(pin.pin_num, f"Callback specified without bouncetime in pin {pin.pin_num}")
            GPIO.add_event_detect(
                pin.pin_num,
                GPIO.FALLING if pin.pull_dir == GPIO.PUD_UP else GPIO.RISING,
                callback=pin.callback,
                bouncetime=pin.bouncetime
            )

def setupOutputs(pins : list[OutputPinData], set_mode : bool):
    _setMode(set_mode)
    for pin in pins:
        # TODO: make sure this is the right order for GPIO setup
        GPIO.setup(pin.pin_num, GPIO.OUT) # set up an output pin
        if pin.output_param is not None:
            GPIO.output(pin.pin_num, pin.output_param)

def setupPwm(pin : int, frequency : int) -> GPIO.PWM:
    try:
        if not done_setmode:
            raise PinInitException(-1, "Setmode not done before PWM init")
        return GPIO.PWM(pin, frequency)
    except RuntimeError:
        raise PinInitException(pin, f"Was not able to setup PWM on pin {pin}. Ensure pin has been set up")

Work Distribution

Adith Ramachandran (ar729)

ar729@cornell.edu

Software design, Mobilization and navigation programming
Elias Hanna (eph48)

eph48@cornell.edu

Software and Hardware design, Arduino programming and circuit assembly, signal processing programming
Annie Chang (ac2329)

ac2329@cornell.edu

Hardware design, Mechanical design and circuit assembly

Shazam-Bot

Adith Ramachandran (ar729), Elias Hanna (eph48), Annie Chang (ac2329)

Introduction

Objective

Design

Hardware

Audio Sampling Hardware

Motor Control Hardware

Software

Serial

Localization

Signal Analysis

Driver